Abstract

Let $\mathcal{F} = \{F_1, F_2, \ldots, F_n\}$ be a family of $n$ sets on a ground set $X$, such as a
family of balls in $\mathbb{R}^d$. For every finite measure $\mu$ on $X$ such that the sets of
$\mathcal{F}$ are measurable, the classical inclusion-exclusion formula asserts that
$\mu(F_1 \cup F_2 \cup \cdots \cup F_n) = \sum_{I : \emptyset \neq I \subseteq [n]} (-1)^{|I|+1} \mu\bigl(\bigcap_{i \in I} F_i\bigr)$;
that is, the measure of the union is expressed using measures of various intersections. The
number of terms in this formula is exponential in $n$, and a significant amount of research,
originating in applied areas, has been devoted to constructing simpler formulas for particular
families $\mathcal{F}$. We provide the apparently first upper bound valid for an arbitrary
$\mathcal{F}$: we show that every system $\mathcal{F}$ of $n$ sets with $m$ nonempty fields in the
Venn diagram admits an inclusion-exclusion formula with $m^{O(\log^2 n)}$ terms and with $\pm 1$
coefficients, and that such a formula can be computed in $m^{O(\log^2 n)}$ expected time. We also
construct systems of $n$ sets on $n$ points for which every valid inclusion-exclusion formula has
the sum of absolute values of the coefficients at least $\Omega(n^{3/2})$.

1 Introduction 

One of the basic topics in introductory courses of discrete mathematics is the inclusion-exclusion
principle (also called the sieve formula), which allows one to compute the number of elements of a
union $F_1 \cup F_2 \cup \cdots \cup F_n$ of $n$ sets from the knowledge of the sizes of all
intersections of the $F_i$'s. We will consider a slightly more general setting, where we have a
ground set $S$ and a (finite) measure $\mu$ on $S$; then the inclusion-exclusion principle asserts
that for every collection $F_1, F_2, \ldots, F_n$ of $\mu$-measurable sets, we have

$$\mu\Bigl(\bigcup_{i=1}^{n} F_i\Bigr) = \sum_{I : \emptyset \neq I \subseteq [n]} (-1)^{|I|+1} \mu\Bigl(\bigcap_{i \in I} F_i\Bigr). \tag{1}$$

(Here, as usual, $[n] = \{1, 2, \ldots, n\}$ and $|I|$ denotes the cardinality of the set $I$.) This
principle not only plays a fundamental role in various areas of mathematics such as probability
theory or combinatorics, but it also has important algorithmic applications. For instance, it
provides simple methods for the computation of volume or surface area of molecules in computational
biology [PCG+92] and underlies, through efficient computation of Möbius transforms
[Knu97, Section 4.3.4], the best known algorithms for several NP-hard problems including graph
$k$-coloring [BHK09], dominating set [vRNvD09], or partial dominating set and set splitting [NvR10].

The inclusion-exclusion principle involves a number of summands that is exponential in $n$, the
number of sets. In general this cannot be avoided if one wants an exact formula valid for every
family $\mathcal{F} = \{F_1, F_2, \ldots, F_n\}$; see Example 2.3 below for a family for which
Equation (1) is the only solution. Yet, since this is a serious obstacle to efficient uses of
inclusion-exclusion, much effort has been devoted to finding "smaller" formulas. These efforts
essentially organize along two lines of research.

The first approach gives up on exactness and tries to approximate efficiently the measure of
the union using the measure of only some of the intersections. The first results of this flavor
are the classical Bonferroni inequalities [Bon36].^1 It turns out that better approximations can be
obtained by replacing the coefficients $(-1)^{|I|+1}$ by other suitable numbers, and such
Bonferroni-type inequalities have been studied extensively; see, e.g., [Gal96]. Linial and Nisan
[LN90] and Kahn et al. [KLS96] investigated how well $\mu(F_1 \cup \cdots \cup F_n)$ can be
approximated if we know the measure of all intersections $\bigcap_{i \in I} F_i$ with
$I \subseteq [n]$ of size at most $r$. Their main finding is that having $r$ at least of order
$\sqrt{n}$ is both necessary and sufficient for a reasonable approximation in the worst case.
This still leaves us with about $2^{\sqrt{n}}$ terms in approximate inclusion-exclusion formulas.

The second line of research looks for "small" inclusion-exclusion formulas valid for specific
families of sets. To illustrate the type of simplifications afforded by fixing the sets, consider the
family $\mathcal{F} = \{F_1, F_2, F_3\}$ of Figure 1. Since $F_1 \cap F_3 = F_1 \cap F_2 \cap F_3$,
Formula (1) can be simplified to

$$\mu(F_1 \cup F_2 \cup F_3) = \mu(F_1) + \mu(F_2) + \mu(F_3) - \mu(F_1 \cap F_2) - \mu(F_2 \cap F_3).$$

More generally, let us consider a family $\mathcal{F} = \{F_1, F_2, \ldots, F_n\}$, and let us say
that a coefficient vector

$$\alpha = (\alpha_I)_{\emptyset \neq I \subseteq [n]} \in \mathbb{R}^{2^n - 1}$$

is an IE-vector for $\mathcal{F}$ if we have

$$\mu\Bigl(\bigcup_{i=1}^{n} F_i\Bigr) = \sum_{I : \emptyset \neq I \subseteq [n]} \alpha_I \, \mu\Bigl(\bigcap_{i \in I} F_i\Bigr) \tag{2}$$

for every finite measure $\mu$ on the ground set of $\mathcal{F}$ (with all the $F_i$'s measurable).
Given $\mathcal{F}$, we would like to find an IE-vector for $\mathcal{F}$ such that both the number
of nonzero coefficients is small and the coefficients themselves are not too large. This idea,
which we originally learned from [AE07], seems to originate in the work of Kratky [Kra78] on
families of disks in the plane, and a systematic study of such simplifications was initiated by
Naiman and Wynn [NW92, NW97]. We refer to the monograph of Dohmen [Doh03] for an overview of this
line of research.

^1 These assert that if we omit all terms with $|I| > r$ on the right-hand side of (1), then we get
an upper bound for the left-hand side for $r$ odd, and a lower bound for the left-hand side for $r$
even. The case $r = 1$ is the often-used union bound in probability theory.

Figure 1: Three subsets of $\mathbb{R}^2$ admitting a simpler inclusion-exclusion formula. The
ground set $F_1 \cup F_2 \cup F_3$ splits into six nonempty regions recognizable by the filling
pattern.

Given a specific family $\mathcal{F} = \{F_1, F_2, \ldots, F_n\}$ of sets, how small can we expect
an inclusion-exclusion formula to be? This is, roughly speaking, the question we tackle in this
paper. To formalize the problem, we should first specify how $\mathcal{F}$ is given. First let us
consider the Venn diagram of $\mathcal{F}$, which is the partition of the ground set $S$ into
equivalence classes according to the membership in the sets of $\mathcal{F}$. Namely, for each
nonempty index set $\tau \subseteq [n]$, we define the region of $\tau$, denoted by
$\mathrm{reg}(\tau)$, as the set of all points that belong to the sets $F_i$ with $i \in \tau$ and
no others (see Figure 1); that is,

$$\mathrm{reg}(\tau) = \Bigl(\bigcap_{i \in \tau} F_i\Bigr) \setminus \Bigl(\bigcup_{i \notin \tau} F_i\Bigr).$$

The Venn diagram of $\mathcal{F}$ is then the collection of all subsets of $[n]$ with non-empty
regions; that is,

$$\mathcal{V} = \mathcal{V}(\mathcal{F}) := \{\tau \subseteq [n] : \mathrm{reg}(\tau) \neq \emptyset\}.$$

Formally, we thus regard the Venn diagram as a set system on the ground set $[n]$; it can be
regarded as some kind of "dual" of the set system $\mathcal{F}$.
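For finite sets, the definition of the regions can be sketched in a few lines of Python. The toy family below is a hypothetical example chosen by us (it is not taken from the paper); it mimics the situation of Figure 1, where $F_1 \cap F_3 = F_1 \cap F_2 \cap F_3$ and there are six nonempty regions.

```python
def venn_diagram(family):
    """Map each index set tau to its nonempty region reg(tau).

    `family` is a list of Python sets F_1, ..., F_n; indices are 1-based.
    """
    regions = {}
    for point in set().union(*family):
        # tau collects the indices of exactly those sets that contain the point.
        tau = frozenset(i for i, F in enumerate(family, start=1) if point in F)
        regions.setdefault(tau, set()).add(point)
    return regions


# Hypothetical toy family: F_1 and F_3 meet only inside F_2.
family = [{1, 2, 4}, {2, 3, 4, 5}, {4, 5, 6}]
regions = venn_diagram(family)
for tau in sorted(regions, key=lambda t: (len(t), sorted(t))):
    print(sorted(tau), sorted(regions[tau]))
```

Each point of the ground set lands in exactly one region, so the keys of `regions` form the set system $\mathcal{V}$ on $[n]$ described above.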

It is easy to see that, as far as inclusion-exclusion formulas are concerned, all points in a single
region are equivalent; it only matters which of the regions are nonempty. Thus, in order to simplify
our formulations, we can assume that $\mathcal{F}$ is standardized, meaning that the ground set
equals the union of the $F_i$'s and each nonempty region has exactly one point. From an algorithmic
point of view, this amounts to a preprocessing step for $\mathcal{F}$, in which the part of the
ground set $S$ in each nonempty region is contracted to a single point.

Let $\mathcal{F} = \{F_1, F_2, \ldots, F_n\}$ be a family of sets and let $m$ denote the size of
$\mathcal{V}$ (which equals the size of the ground set for $\mathcal{F}$ standardized). A
linear-algebraic argument shows that every (finite) family $\mathcal{F}$ has an inclusion-exclusion
formula with at most $m$ terms (see Corollary 2.4) and $m$ terms are sometimes necessary (see the
beginning of Section 4). The question of how small a formula $\mathcal{F}$ admits may thus seem
settled. There is, however, a caveat: this linear-algebraic argument may yield exponentially large
coefficients (see Example 2.5). If we wanted to use such a formula, we would need to compute with
very high precision, and perhaps more seriously, we would have to know the measures of the various
intersections with an enormous precision, in order to obtain a meaningful result. This may be
totally impractical, e.g., in geometric settings where some physical measurements are involved, or
where the measures of the intersections are computed with limited precision. Thus, we prefer
inclusion-exclusion formulas where not only the number of terms is small, but the coefficients are
also small.

Our main result is the following general upper bound; to our knowledge, it is the first upper
bound applicable for an arbitrary family.

Theorem 1.1. Let $n$ and $m$ be integers and let
$D = \lceil 2e \ln m \rceil \lceil 1 + \ln \frac{n}{\ln m} \rceil$. Then for every family
$\mathcal{F}$ of $n$ sets with Venn diagram of size $m$, there is an IE-vector $\alpha$ for
$\mathcal{F}$ that has at most $\sum_{i=1}^{D} \binom{n}{i} = m^{O(\ln^2 n)}$ nonzero coefficients,
and in which all nonzero coefficients are $\pm 1$'s. Such an $\alpha$ can be computed in
$m^{O(\ln^2 n)}$ expected time if $\mathcal{F}$ is standardized.

The bound in this theorem is pseudo-polynomial, but not polynomial, in $m$ and $n$. We do not
know if a polynomial bound can be achieved, for example, with $\pm 1$ coefficients. We have at
least the following lower bound, proved in Section 4, showing that inclusion-exclusion formulas of
linear size are impossible in general.

Theorem 1.2. For infinitely many values of $n$, there are families of $n$ sets on $n$ points, for
which every IE-vector has $\ell_1$-norm at least $(n/2)^{3/2}$.

We recall that the $\ell_1$-norm of a real vector $x \in \mathbb{R}^k$ is
$\|x\|_1 = \sum_{i=1}^{k} |x_i|$. The $\ell_1$-norm gives a lower bound on the tradeoff between the
number of nonzero coefficients and their orders of magnitude (we recall that a formula with $O(m)$
nonzero coefficients is always attainable, the problem being that the coefficients may be too
large).

Remark on $\ell_1$-norm minimization. A useful heuristic for finding "small" IE-vectors might
be to look for an IE-vector of minimum $\ell_1$-norm. In the linear-algebraic formulation, this
means finding a solution of $Ax = \mathbf{1}$ of minimum $\ell_1$-norm.

It is well known that finding a solution of minimum $\ell_1$-norm of a linear system can be done
in polynomial time, via linear programming. Several specialized algorithms for this problem have
also been developed, with better performance than direct application of general-purpose LP solvers
(see, e.g., [YGZ+10] for a recent overview). However, in our setting the number of columns of
the matrix $A$ may be exponential in $m$ and $n$, and so even the input for an $\ell_1$-norm
minimizing algorithm would be too large.

There are linear programs with exponentially many variables (and polynomially many constraints)
that can still be solved in polynomial time. For example, one may attempt, at least for
theoretical purposes, to solve the dual linear program by the ellipsoid method, provided that a
separation oracle is available.

In our setting, the task of the separation oracle can be formulated as follows in the setting of
the original (standardized) set system $\mathcal{F} = \{F_1, \ldots, F_n\}$: Given weights
$w_1, \ldots, w_m \in \mathbb{R}$ of the points and a threshold $c$, find a subset
$I \subseteq [n]$, if one exists, such that the sum of weights of the points in
$\bigcap_{i \in I} F_i$ is at least $c$. Unfortunately, as was shown by Hoffmann et al. [HOR+12],
this problem is NP-complete not only for arbitrary set systems, but also, e.g., for the case where
each $F_i$ is the complement of a hexagon in the plane. Thus, this approach doesn't seem to lead to
a polynomial-time algorithm for finding an IE-vector of minimum $\ell_1$-norm even for rather
simple geometric settings.
2 Preliminaries



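As in the introduction, we consider a family $\mathcal{F} = \{F_1, F_2, \ldots, F_n\}$ of sets on a
ground set $S$, and we assume that the $F_i$ are all distinct. Besides the Venn diagram
$\mathcal{V}$, we associate yet another set system with $\mathcal{F}$, namely, the nerve
$\mathcal{N}$ of $\mathcal{F}$:

$$\mathcal{N} = \mathcal{N}(\mathcal{F}) := \Bigl\{\sigma \subseteq [n] : \sigma \neq \emptyset,\ \bigcap_{i \in \sigma} F_i \neq \emptyset\Bigr\}.$$

So both $\mathcal{N}$ and $\mathcal{V}$ have ground set $[n]$, and we have
$\mathcal{V} \subseteq \mathcal{N}$.

Let us enumerate the elements of $\mathcal{V}$ as $\mathcal{V} = \{\tau_1, \tau_2, \ldots, \tau_m\}$
in such a way that $|\tau_i| \le |\tau_j|$ for $i < j$, and let us enumerate
$\mathcal{N} = \{\sigma_1, \sigma_2, \ldots, \sigma_{|\mathcal{N}|}\}$ so that the sets of
$\mathcal{V}$ come first, i.e., $\sigma_i = \tau_i$ for $i = 1, 2, \ldots, m$.

In the introduction, we were indexing IE-vectors for $\mathcal{F}$ by all possible subsets
$I \subseteq [n]$. But if $I$ is not in the nerve, the corresponding intersection is empty, and thus
w.l.o.g. we may assume that its coefficient is zero. Thus, from now on, we will index IE-vectors
$x$ as $(x_1, \ldots, x_{|\mathcal{N}|})$, where $x_j$ is the coefficient of
$\mu\bigl(\bigcap_{i \in \sigma_j} F_i\bigr)$.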

IE-vectors from linear algebra. Let $A = (a_{jk})$ denote the 0-1 matrix with $m$ rows and
$|\mathcal{N}|$ columns such that $a_{jk} = 1$ if $\tau_j \supseteq \sigma_k$ and $a_{jk} = 0$
otherwise. Let $\mathbf{1}$ denote the $m$-dimensional vector with all entries equal to 1.

Lemma 2.1. A vector $x \in \mathbb{R}^{|\mathcal{N}|}$ is an IE-vector for $\mathcal{F}$ if and only
if $Ax = \mathbf{1}$.



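Proof. A vector $x \in \mathbb{R}^{|\mathcal{N}|}$ is an IE-vector for $\mathcal{F}$ if and only if
for every finite measure $\mu$ on $S$ we have

$$\mu\Bigl(\bigcup_{i=1}^{n} F_i\Bigr) = \sum_{k=1}^{|\mathcal{N}|} x_k \, \mu\Bigl(\bigcap_{i \in \sigma_k} F_i\Bigr). \tag{3}$$

We first reformulate Equation (3) using the regions of $\mathcal{F}$. The regions decompose
$\bigcup_{i=1}^{n} F_i$ in a way that is compatible with the intersections
$\bigcap_{i \in \sigma} F_i$:

$$\bigcup_{i=1}^{n} F_i = \bigcup_{\tau \in \mathcal{V}} \mathrm{reg}(\tau) \quad\text{and for all } \sigma \in \mathcal{N},\quad \bigcap_{i \in \sigma} F_i = \bigcup_{\tau \in \mathcal{V} :\, \tau \supseteq \sigma} \mathrm{reg}(\tau).$$

Moreover, the regions are pairwise disjoint. Thus, for every finite measure $\mu$ on $S$ we have

$$\mu\Bigl(\bigcup_{i=1}^{n} F_i\Bigr) = \sum_{\tau \in \mathcal{V}} \mu(\mathrm{reg}(\tau)) \quad\text{and for all } \sigma \in \mathcal{N},\quad \mu\Bigl(\bigcap_{i \in \sigma} F_i\Bigr) = \sum_{\tau \in \mathcal{V} :\, \tau \supseteq \sigma} \mu(\mathrm{reg}(\tau)),$$

and Equation (3) is equivalent to

$$\sum_{\tau \in \mathcal{V}} \mu(\mathrm{reg}(\tau)) = \sum_{k=1}^{|\mathcal{N}|} x_k \Bigl(\sum_{\tau \in \mathcal{V} :\, \tau \supseteq \sigma_k} \mu(\mathrm{reg}(\tau))\Bigr).$$

Using the orderings on $\mathcal{V}$ and $\mathcal{N}$ and the definition of $A$ we obtain that
$x \in \mathbb{R}^{|\mathcal{N}|}$ is an IE-vector for $\mathcal{F}$ if and only if for every finite
measure $\mu$ on $S$ we have

$$\sum_{j=1}^{m} \mu(\mathrm{reg}(\tau_j)) = \sum_{k=1}^{|\mathcal{N}|} x_k \Bigl(\sum_{j=1}^{m} a_{jk} \, \mu(\mathrm{reg}(\tau_j))\Bigr) = \sum_{j=1}^{m} \Bigl(\sum_{k=1}^{|\mathcal{N}|} a_{jk} x_k\Bigr) \mu(\mathrm{reg}(\tau_j)). \tag{4}$$

Now, if $Ax = \mathbf{1}$ then Equation (4) trivially holds for all $\mu$, and $x$ is an IE-vector
for $\mathcal{F}$. Conversely, assume that $x$ is an IE-vector for $\mathcal{F}$, and thus that
Equation (4) holds for all $\mu$. For $1 \le j \le m$ we pick $p_j \in \mathrm{reg}(\tau_j)$ and
define the measure $\mu_j : 2^S \to \mathbb{R}$ by $\mu_j(T) = 1$ if $p_j \in T$ and $\mu_j(T) = 0$
otherwise. Equation (4) then specializes to

$$1 = \mu_j(\mathrm{reg}(\tau_j)) = \sum_{k=1}^{|\mathcal{N}|} x_k a_{jk} \, \mu_j(\mathrm{reg}(\tau_j)) = \sum_{k=1}^{|\mathcal{N}|} a_{jk} x_k.$$

This implies that $(Ax)_j = 1$. The statement follows. □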
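The criterion $Ax = \mathbf{1}$ is easy to test on small finite families. The following Python sketch builds $\mathcal{V}$, $\mathcal{N}$ and the matrix $A$ by brute force and checks three candidate vectors; the helper names and the toy family are our own illustrative choices, not part of the paper.

```python
from itertools import combinations


def nonempty_subsets(n):
    """All nonempty subsets of {1, ..., n}, as frozensets."""
    for size in range(1, n + 1):
        for combo in combinations(range(1, n + 1), size):
            yield frozenset(combo)


def venn_and_nerve(family):
    """Return (taus, sigmas), ordered by size, with the Venn sets listed first among the sigmas."""
    n = len(family)
    ground = set().union(*family)
    venn = {frozenset(i for i in range(1, n + 1) if p in family[i - 1]) for p in ground}
    nerve = {s for s in nonempty_subsets(n)
             if set.intersection(*(family[i - 1] for i in s))}
    key = lambda t: (len(t), sorted(t))
    taus = sorted(venn, key=key)
    sigmas = taus + sorted(nerve - venn, key=key)
    return taus, sigmas


def build_A(taus, sigmas):
    """a_jk = 1 if tau_j contains sigma_k, and 0 otherwise."""
    return [[1 if tau >= sigma else 0 for sigma in sigmas] for tau in taus]


def is_ie_vector(A, x):
    """Lemma 2.1: x is an IE-vector exactly when every row sum of A x equals 1."""
    return all(sum(a * xi for a, xi in zip(row, x)) == 1 for row in A)


family = [{1, 2, 4}, {2, 3, 4, 5}, {4, 5, 6}]  # hypothetical toy family
taus, sigmas = venn_and_nerve(family)
A = build_A(taus, sigmas)

standard = [(-1) ** (len(s) + 1) for s in sigmas]
dropped = {frozenset({1, 3}), frozenset({1, 2, 3})}
simplified = [0 if s in dropped else (-1) ** (len(s) + 1) for s in sigmas]
singletons_only = [1 if len(s) == 1 else 0 for s in sigmas]

print(is_ie_vector(A, standard), is_ie_vector(A, simplified), is_ie_vector(A, singletons_only))
```

On this family the standard vector and the five-term simplified vector both pass, while the vector that keeps only the singleton terms fails, as one would expect from the union bound being strict here.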

Remark 2.2. In our definition a vector $x$ is an IE-vector for $\mathcal{F}$ if and only if
Equation (2) is valid for every finite measure. As follows from the proof of Lemma 2.1, this
definition is equivalent to extending this requirement to every (finitely additive) signed measure.
(A signed measure satisfies the classical axioms of a measure with the exception that it may take
negative values.)

Example 2.3. Let $S = 2^{[n]} \setminus \{[n]\}$ and $F_i = 2^{[n] \setminus \{i\}}$ for
$i \in [n]$. It is easy to see that here $\mathcal{N} = \mathcal{V}$ and $A$ is a lower-triangular
square matrix with 1's on the diagonal. Hence $A$ is invertible and, by Lemma 2.1, $\mathcal{F}$
has a unique IE-vector, namely, the one from the standard inclusion-exclusion formula.

Corollary 2.4. For every finite family $\mathcal{F}$, there is a unique IE-vector $\alpha$ supported
on $\mathcal{V}$ (that is, such that $\alpha_I = 0$ for $I \notin \mathcal{V}$), and this $\alpha$
has all entries integral.

Proof. Let $B$ be the $m \times m$ submatrix of $A$ consisting of the first $m$ columns of $A$. The
IE-vectors for $\mathcal{F}$ supported on $\mathcal{V}$ are in one-to-one correspondence with the
solutions of $By = \mathbf{1}$. Since $B$ is lower-triangular and has 1's on the main diagonal, it
is nonsingular, and hence $By = \mathbf{1}$ has exactly one solution. Moreover, since $B$ is a
lower-triangular 0-1 matrix, this solution is integral. □

Unfortunately, the IE-vector with small support given by Corollary 2.4 might have exponentially
large coefficients, as the following example shows.

Example 2.5. Let $S = [5\ell]$ for some positive integer $\ell$, and for $i \le 5\ell$, let $g(i)$
stand for the smallest integer $j \ge i$ divisible by 5; that is, $g(i) = 5\lceil i/5 \rceil$. We
consider the set system $\mathcal{F} = \{F_1, F_2, \ldots, F_{5\ell}\}$ on $S$ given by
$F_i = \{i\} \cup \{g(i)+1, \ldots, 5\ell\}$. Now $j \in F_i$ if and only if $i = j$ or
$j > g(i)$. In particular, no two elements of $S$ belong to the same region and the number of
regions of $\mathcal{F}$ is $m = |S| = 5\ell$, which is also equal to the number $n$ of sets in
$\mathcal{F}$: $n = m = 5\ell$. The lower-triangular matrix $B$ from the proof of Corollary 2.4 has
a simple structure in terms of $5 \times 5$ blocks: the blocks on the diagonal are identity blocks,
and the blocks below the diagonal are filled with 1's. Let $x$ denote the solution of
$Bx = \mathbf{1}$. The first five rows yield $x_1 = x_2 = \cdots = x_5 = 1$. The next five rows
imply that for $j = 6, 7, \ldots, 10$ we have

$$x_1 + x_2 + \cdots + x_5 + x_j = 1,$$

and so $x_6 = x_7 = \cdots = x_{10} = -4$. A simple induction yields $x_i = (-4)^{g(i)/5 - 1}$.
Altogether, the largest coefficient is of order $4^{n/5}$. (Replacing the constant 5 by another
constant $y$ yields a similar exponential growth with basis $(y-1)^{1/y}$; the choice $y = 5$
maximizes the basis of the exponent.)
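The growth of the coefficients in Example 2.5 can be reproduced directly from the set system. The Python sketch below (function name and structure are our own) recomputes the regions, orders them by size, and solves $Bx = \mathbf{1}$ by forward substitution, which is valid because $B$ is lower-triangular with 1's on the diagonal.

```python
def example_2_5_coefficients(ell):
    """Return pairs (j, x_j) for the family F_i = {i} ∪ {g(i)+1, ..., 5*ell}."""
    n = 5 * ell
    g = lambda i: 5 * ((i + 4) // 5)  # smallest multiple of 5 that is >= i
    family = {i: {i} | set(range(g(i) + 1, n + 1)) for i in range(1, n + 1)}

    # Region of the point j: tau_j = {i : j in F_i}; all regions are distinct here.
    taus = {j: frozenset(i for i in family if j in family[i]) for j in range(1, n + 1)}
    order = sorted(taus, key=lambda j: (len(taus[j]), sorted(taus[j])))

    x = {}
    for j in order:
        # Forward substitution: subtract the coefficients of all regions contained in tau_j.
        x[j] = 1 - sum(x[k] for k in x if taus[k] < taus[j])
    return sorted(x.items())


coeffs = example_2_5_coefficients(3)
print(coeffs)
```

For `ell = 3` the three blocks of five coefficients come out as 1, then -4, then 16, in agreement with $x_i = (-4)^{g(i)/5 - 1}$.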

Abstract tubes. While studying possible simplifications of inclusion-exclusion formulas, Naiman
and Wynn [NW92, NW97] started from families $\mathcal{F} = \{F_1, F_2, \ldots, F_n\}$ that were
tube-like in the sense that $F_i \cap F_j \subseteq F_k$ for all $i < k < j$ (as in our Figure 1).
The simplifications identified for "simple tubes" hold in a broader setting, leading Naiman and
Wynn to introduce the more general notion of an abstract tube. This notion will also play an
important role in our considerations.

Definition 2.6. An (abstract) simplicial complex with vertex set $[n]$ is a hereditary system of
nonempty subsets of $[n]$.^2 An abstract tube is a pair $(\mathcal{F}, \mathcal{K})$, where
$\mathcal{F} = \{F_1, F_2, \ldots, F_n\}$ is a family of sets and $\mathcal{K}$ is a simplicial
complex with vertex set $[n]$, such that for every nonempty region $\tau$ of the Venn diagram of
$\mathcal{F}$, the subcomplex induced on $\mathcal{K}$ by $\tau$,
$\mathcal{K}[\tau] := \{\vartheta \in \mathcal{K} : \vartheta \subseteq \tau\}$, is contractible.^3

As first noted by Naiman and Wynn [NW92, NW97], if $(\mathcal{F}, \mathcal{K})$ is an abstract
tube, then

$$\mu\Bigl(\bigcup_{i=1}^{n} F_i\Bigr) = \sum_{\sigma \in \mathcal{K}} (-1)^{|\sigma|+1} \mu\Bigl(\bigcap_{i \in \sigma} F_i\Bigr). \tag{5}$$

Moreover, truncating the sum yields upper and lower bounds in the spirit of the Bonferroni
inequalities [Doh03, Theorem 3.1.9].

Remark 2.7. An earlier, more permissive definition of abstract tubes by [NW92] had the weaker
condition "$\chi(\mathcal{K}[\tau]) = 1$" instead of "$\mathcal{K}[\tau]$ contractible," where
$\chi$ is the Euler characteristic. We recall that for a simplicial complex $\mathcal{C}$ in our
sense, we have $\chi(\mathcal{C}) = \sum_{\sigma \in \mathcal{C}} (-1)^{|\sigma|+1}$. Using this
definition and Lemma 2.1, the proof of (5) can be given in a few lines. Indeed, consider a
simplicial complex $\mathcal{K}$ with vertex set $[n]$ and let $x \in \mathbb{R}^{|\mathcal{N}|}$
stand for the vector with $x_k = (-1)^{|\sigma_k|+1}$ if $\sigma_k \in \mathcal{K}$ and $x_k = 0$
otherwise. Since

$$(Ax)_j = \sum_{k :\, \sigma_k \subseteq \tau_j} x_k = \sum_{\sigma \in \mathcal{K}[\tau_j]} (-1)^{|\sigma|+1},$$

we have $(Ax)_j = \chi(\mathcal{K}[\tau_j])$. Thus, if all the $\mathcal{K}[\tau_j]$ have Euler
characteristic 1, then $x$ is an IE-vector, and (5) follows.

The stronger definition of abstract tubes involving contractibility was needed in order to
guarantee that truncations of Equation (5) also yield Bonferroni-type inequalities
[Doh03, Theorem 3.1.9].
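The Euler-characteristic condition of Remark 2.7 is straightforward to check by machine. The sketch below is a minimal Python illustration with a hand-written Venn diagram and two candidate complexes, all of them hypothetical examples of ours; it only tests the weaker condition $\chi(\mathcal{K}[\tau]) = 1$ and says nothing about contractibility.

```python
def euler_characteristic(simplices):
    """chi(C) = sum of (-1)^(|sigma|+1); the empty set is not a simplex in this convention."""
    return sum((-1) ** (len(s) + 1) for s in simplices)


def induced(simplices, tau):
    """Subcomplex K[tau]: all simplices contained in tau."""
    return {s for s in simplices if s <= tau}


fs = frozenset
# Venn diagram of a hypothetical family of three sets with six nonempty regions.
venn = [fs({1}), fs({2}), fs({3}), fs({1, 2}), fs({2, 3}), fs({1, 2, 3})]

path = {fs({1}), fs({2}), fs({3}), fs({1, 2}), fs({2, 3})}  # the path 1 - 2 - 3
points = {fs({1}), fs({2}), fs({3})}                        # three isolated vertices

chi_path = [euler_characteristic(induced(path, tau)) for tau in venn]
chi_points = [euler_characteristic(induced(points, tau)) for tau in venn]
print(chi_path, chi_points)
```

The path complex has Euler characteristic 1 on every region and therefore yields a valid five-term formula; the complex of isolated vertices fails on every region with more than one index.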

Small abstract tubes have been identified for families of balls [NW92, NW97, AE07] or half-spaces
[NW97] in $\mathbb{R}^d$, and similar structures were found for families of pseudodisks [ER97]. We
establish Theorem 1.1 by proving that for every family of sets there exists an abstract tube with
"small" size that, in addition, can be computed efficiently. We will use the following sufficient
condition guaranteeing that $(\mathcal{F}, \mathcal{K})$ is an abstract tube; it is a reformulation
of [Doh03, Theorem 4.2.5] (for the reader's convenience we include a simple proof). Let
$\mathrm{MNF}(\mathcal{K})$ denote the system of all inclusion-minimal non-faces of $\mathcal{K}$,
i.e., of all nonempty sets $I \subseteq [n]$ with $I \notin \mathcal{K}$ but with
$I' \in \mathcal{K}$ for every nonempty proper subset $I' \subset I$.

^2 We emphasize that we exclude the empty set from the definition of a simplicial complex. This is
a non-standard definition; however, it is convenient for our purposes.

^3 By contractible we mean contractibility in the sense of topology; roughly speaking, the
topological space defined by $\mathcal{K}[\tau]$ can be continuously shrunk to a point. Readers not
at ease with this notion may want to look at Remark 2.7.

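Proposition 2.8. Let $\mathcal{F} = \{F_1, F_2, \ldots, F_n\}$ be a family of sets with Venn diagram
$\mathcal{V}$ and let $\mathcal{K}$ be a simplicial complex with vertex set $[n]$. If no set of
$\mathcal{V}$ can be expressed as a union of sets in $\mathrm{MNF}(\mathcal{K})$, then
$(\mathcal{F}, \mathcal{K})$ is an abstract tube.

Proof. Let $\tau \in \mathcal{V}$ and let $a \in \tau$ be such that $a$ belongs to no element of
$\mathrm{MNF}(\mathcal{K})$ contained in $\tau$; such an $a$ exists because $\tau$ is not a union of
sets in $\mathrm{MNF}(\mathcal{K})$. We claim that for every simplex
$\vartheta \in \mathcal{K}[\tau]$, we have $\vartheta \cup \{a\} \in \mathcal{K}[\tau]$. Indeed, if
$\vartheta \cup \{a\} \notin \mathcal{K}[\tau]$, then $\vartheta \cup \{a\}$ contains some
$\beta \in \mathrm{MNF}(\mathcal{K})$; as $\vartheta \in \mathcal{K}[\tau]$, it must be that $\beta$
contains $a$, a contradiction. Thus $\vartheta \cup \{a\} \in \mathcal{K}[\tau]$ for every
$\vartheta \in \mathcal{K}[\tau]$. In other words, $\mathcal{K}[\tau]$ is a cone with apex $a$.
Since every cone is contractible, the statement follows. □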



3 The upper bound: proof of Theorem 1.1 



Abstract tubes from selectors. Let $\mathcal{F} = \{F_1, F_2, \ldots, F_n\}$ be a family of sets,
and let $\mathcal{V}$ be the Venn diagram of $\mathcal{F}$. We recall that a selector for
$\mathcal{V}$ is a map $w : \mathcal{V} \to [n]$ such that $w(\tau) \in \tau$ for every
$\tau \in \mathcal{V}$. We observe that each selector for $\mathcal{V}$ provides an abstract tube
for $\mathcal{F}$ (which satisfies the sufficient condition of Proposition 2.8).



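Lemma 3.1. Let $\mathcal{F} = \{F_1, F_2, \ldots, F_n\}$, $\mathcal{V} = \mathcal{V}(\mathcal{F})$,
and let $w$ be a selector for $\mathcal{V}$. We define the simplicial complex

$$\mathcal{K}_w = \{\sigma \in \mathcal{N}(\mathcal{F}) : \text{for all nonempty } \vartheta \subseteq \sigma \text{ there is } \tau \in \mathcal{V} \text{ such that } w(\tau) \in \vartheta \subseteq \tau\}.$$

Then $(\mathcal{F}, \mathcal{K}_w)$ is an abstract tube.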

Proof. This is simple once the idea behind the definition of $\mathcal{K}_w$ is explained. Namely,
in the condition of Proposition 2.8 we want to prevent each set $\tau \in \mathcal{V}$ from being a
union of minimal non-faces of the simplicial complex $\mathcal{K}$. Our way of achieving that is to
insist that every minimal non-face $I$ contained in $\tau$ avoids the point $w(\tau)$; thus, we
consider the set system of "admissible minimal non-faces"

$$\mathcal{B}_w := \{I \subseteq [n],\ I \neq \emptyset : \text{if } I \subseteq \tau \in \mathcal{V}, \text{ then } w(\tau) \notin I\}.$$

Then the above definition of $\mathcal{K}_w$ can be interpreted as follows: a simplex
$\sigma \in \mathcal{N}$ belongs to $\mathcal{K}_w$ if it contains no $I \in \mathcal{B}_w$.^4
(Simplices outside $\mathcal{N}$ can be ignored, since their supersets cannot be contained in a set
$\tau \in \mathcal{V}$.) Therefore, all minimal non-faces of $\mathcal{K}_w$ belong to
$\mathcal{B}_w$ or lie outside $\mathcal{N}$, and hence $(\mathcal{F}, \mathcal{K}_w)$ is an
abstract tube by Proposition 2.8. □



Let us remark that there is no loss of generality in passing from the abstract tubes as in
Proposition 2.8 to those of the form $\mathcal{K}_w$. Indeed, if $\mathcal{K}$ satisfies the
condition of Proposition 2.8, then every $\tau \in \mathcal{V}$ contains at least one point that is
not contained in any minimal non-face $I$ of $\mathcal{K}$ with $I \subseteq \tau$, and such a point
can be chosen as $w(\tau)$; then we can easily check that $\mathcal{K}_w \subseteq \mathcal{K}$.
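The construction of $\mathcal{K}_w$ from a selector can be tried out on a small example. The following Python sketch enumerates all subsets by brute force (so it is only meant for tiny $n$) and evaluates the resulting formula for the counting measure; the toy family and the particular selector are hypothetical choices of ours. Note that the membership test with $\vartheta = \sigma$ already forces $\sigma \subseteq \tau$ for some $\tau \in \mathcal{V}$, so $\sigma$ lies in the nerve automatically.

```python
from itertools import combinations


def nonempty_subsets(items):
    """All nonempty subsets of `items`, as frozensets."""
    items = sorted(items)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def venn_diagram(family):
    """Index sets tau of the nonempty regions."""
    ground = set().union(*family)
    return {frozenset(i for i, F in enumerate(family, 1) if p in F) for p in ground}


def selector_complex(venn, n, w):
    """K_w: every nonempty theta ⊆ sigma needs some tau with w[tau] in theta ⊆ tau."""
    def passes(theta):
        return any(w[tau] in theta and theta <= tau for tau in venn)
    return {sigma for sigma in nonempty_subsets(range(1, n + 1))
            if all(passes(theta) for theta in nonempty_subsets(sigma))}


def tube_sum(family, simplices):
    """Right-hand side of the abstract-tube formula for the counting measure."""
    total = 0
    for sigma in simplices:
        common = set.intersection(*(family[i - 1] for i in sigma))
        total += (-1) ** (len(sigma) + 1) * len(common)
    return total


family = [{1, 2, 4}, {2, 3, 4, 5}, {4, 5, 6}]  # hypothetical toy family
venn = venn_diagram(family)
# Hypothetical selector: pick index 2 whenever it is available.
w = {tau: (2 if 2 in tau else min(tau)) for tau in venn}
K = selector_complex(venn, len(family), w)
print(sorted(sorted(s) for s in K), tube_sum(family, K))
```

With this selector the complex consists of the three vertices and the edges $\{1,2\}$ and $\{2,3\}$, and the five-term sum equals the size of the union. Choosing `min(tau)` as the selector instead gives back all seven terms of the standard formula on this family, which illustrates that the choice of selector matters.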



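^4 Note that for the formal verification, the condition that $\sigma$ contains no
$I \in \mathcal{B}_w$ can be written, in symbols, as follows:
$\forall I \subseteq [n],\ I \neq \emptyset : \bigl((\forall \tau \in \mathcal{V} : I \subseteq \tau \Rightarrow w(\tau) \notin I) \Rightarrow I \not\subseteq \sigma\bigr)$.
This is equivalent to
$\forall I \subseteq [n],\ I \neq \emptyset : I \subseteq \sigma \Rightarrow (\exists \tau \in \mathcal{V} : I \subseteq \tau \wedge w(\tau) \in I)$,
which is just a transcription of $\sigma \in \mathcal{K}_w$.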
Figure 2: Illustration for Lemma 3.2. If $\mathcal{K}_\rho$ contains the simplex
$\{i_1, i_2, \ldots, i_5\}$, then $\Gamma_\rho$ must contain a row compatible with
$\{i_s, i_{s+1}, \ldots, i_5\}$ for $s = 1, 2, \ldots, 5$. The $j_s$ row is emphasized,
constrained values appearing in grey; rows $j_s$ for other values of $s$ are represented
consecutively for clarity, but they can appear in any order and non-consecutively.



Large simplices in random $\mathcal{K}_w$. Let $\rho$ be a permutation of $[n]$. We define a
selector $w_\rho$ for $\mathcal{V}$ by taking $w_\rho(\tau)$ as the smallest element of $\tau$ in
the linear ordering $\prec$ on $[n]$ given by $\rho(1) \prec \rho(2) \prec \cdots \prec \rho(n)$.

For better readability we write $\mathcal{K}_\rho$ instead of $\mathcal{K}_{w_\rho}$. We want to
show that for random $\rho$, $\mathcal{K}_\rho$ is unlikely to contain too large simplices, and thus
leads to a small inclusion-exclusion formula.

Let $\Gamma$ denote the incidence matrix of $\mathcal{V}$, that is, the 0-1 matrix with $m$ rows
and $n$ columns where $\Gamma_{ij} = 1$ if and only if $j \in \tau_i$ (if the original system was
standardized, then $\Gamma$ is the transposition of the usual incidence matrix of $\mathcal{F}$).
We also denote by $\Gamma_\rho$ the matrix obtained by applying the permutation $\rho$ to the
columns of $\Gamma$: the $\rho(i)$th column of $\Gamma_\rho$ is the $i$th column of $\Gamma$ and
represents the incidences between permuted $[n]$ and $\mathcal{V}$. We now argue that if
$\mathcal{K}_\rho$ contains a large simplex, then $\Gamma_\rho$ contains a particular substructure.

We say that a row $R$ of $\Gamma_\rho$ is compatible with a subset $I \subseteq [n]$ if $R$
contains 1's in all columns with index in $I$ and 0's in all columns with index smaller than
$\min(I)$.

Lemma 3.2. If $\rho(\tau) = \{i_1, i_2, \ldots, i_k\}$ for a simplex $\tau$ in $\mathcal{K}_\rho$,
with $i_1 < i_2 < \cdots < i_k$, then for every $s \in \{1, 2, \ldots, k\}$ the matrix
$\Gamma_\rho$ contains a row compatible with $\{i_s, i_{s+1}, \ldots, i_k\}$.

Proof. Let $s \in \{1, 2, \ldots, k\}$, let $I_s = \{i_s, i_{s+1}, \ldots, i_k\}$, and let
$\vartheta_s = \rho^{-1}(I_s)$. We refer to Figure 2. Since $\vartheta_s$ is a simplex of
$\mathcal{K}_\rho$, there exists $\tau_{j_s} \in \mathcal{V}$ such that
$w_\rho(\tau_{j_s}) \in \vartheta_s \subseteq \tau_{j_s}$ by Lemma 3.1. Since
$\vartheta_s \subseteq \tau_{j_s}$, we have $I_s = \rho(\vartheta_s) \subseteq \rho(\tau_{j_s})$,
and hence the $j_s$th row of $\Gamma_\rho$ has 1's in all columns with index in $I_s$. Since
$w_\rho(\tau_{j_s}) \in \vartheta_s$, the set $\rho(\tau_{j_s})$ contains no $i$ with $i < i_s$ and
the $j_s$th row of $\Gamma_\rho$ has 0's in all columns with index smaller than
$i_s = \min(I_s)$. It follows that the $j_s$th row of $\Gamma_\rho$ is compatible with $I_s$. □

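We will need the following inequality.

Lemma 3.3. Let $x_1, \ldots, x_r$ be positive real numbers with $x_1 + \cdots + x_r \le n$. Then

$$\frac{x_1}{x_1 + \cdots + x_r} \cdot \frac{x_2}{x_2 + \cdots + x_r} \cdots \frac{x_{r-1}}{x_{r-1} + x_r} \le \Bigl(1 - (x_r/n)^{1/(r-1)}\Bigr)^{r-1}.$$

Proof. Let us set $y_\ell := x_\ell + x_{\ell+1} + \cdots + x_r$. Then we have

$$\frac{x_1}{x_1 + \cdots + x_r} \cdot \frac{x_2}{x_2 + \cdots + x_r} \cdots \frac{x_{r-1}}{x_{r-1} + x_r} = \frac{y_1 - y_2}{y_1} \cdot \frac{y_2 - y_3}{y_2} \cdots \frac{y_{r-1} - y_r}{y_{r-1}} = \Bigl(1 - \frac{y_2}{y_1}\Bigr)\Bigl(1 - \frac{y_3}{y_2}\Bigr) \cdots \Bigl(1 - \frac{y_r}{y_{r-1}}\Bigr)$$

$$\le \Bigl(\frac{1 - y_2/y_1 + 1 - y_3/y_2 + \cdots + 1 - y_r/y_{r-1}}{r-1}\Bigr)^{r-1} = \Bigl(1 - \frac{y_2/y_1 + y_3/y_2 + \cdots + y_r/y_{r-1}}{r-1}\Bigr)^{r-1}$$

$$\le \Bigl(1 - (y_r/y_1)^{1/(r-1)}\Bigr)^{r-1} \le \Bigl(1 - (x_r/n)^{1/(r-1)}\Bigr)^{r-1}.$$

Here the first two inequalities follow from the inequality between the arithmetic and geometric
means, and the last one uses $y_r = x_r$ and $y_1 \le n$. □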
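As a sanity check, the inequality of Lemma 3.3, read here as $\prod_{\ell=1}^{r-1} x_\ell/(x_\ell + \cdots + x_r) \le (1 - (x_r/n)^{1/(r-1)})^{r-1}$, can be tested numerically. The Python sketch below samples random positive inputs; the sampling ranges and the floating-point tolerance are arbitrary choices of ours, and such a test is of course no substitute for the proof.

```python
import random


def lemma_3_3_sides(xs, n):
    """Return (lhs, rhs) of the inequality for positive xs with sum(xs) <= n."""
    r = len(xs)
    lhs = 1.0
    for l in range(r - 1):
        lhs *= xs[l] / sum(xs[l:])  # x_l / (x_l + ... + x_r)
    rhs = (1 - (xs[-1] / n) ** (1 / (r - 1))) ** (r - 1)
    return lhs, rhs


rng = random.Random(0)
worst_gap = float("inf")
for _ in range(1000):
    r = rng.randint(2, 6)
    xs = [rng.uniform(0.1, 10.0) for _ in range(r)]
    n = sum(xs) * rng.uniform(1.0, 3.0)  # any n with sum(xs) <= n is allowed
    lhs, rhs = lemma_3_3_sides(xs, n)
    worst_gap = min(worst_gap, rhs - lhs)

print(worst_gap >= -1e-12)
```

For $r = 2$ and $n = x_1 + x_2$ both sides coincide, so the bound cannot be improved in general without further assumptions.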



Now we aim at showing that for a random $\rho$, the condition in Lemma 3.2 is unlikely to be
satisfied for large $k$. That condition prescribes the existence of $k$ rows in $\Gamma_\rho$ with
a certain pattern. In order to get a good bound for $k$, we won't actually look for all of these
$k$ rows, but rather we will consider only each $b$th of them, for a suitable integer parameter
$b$, and ignore the rest.

Namely, we fix two parameters $r$ and $b$ with $1 \le b \le n$ and set $k = rb$ (we think of
$r \approx \ln n$ and $b \approx \ln m$). For an $r$-element index set $J \subseteq [m]$, let
$\Gamma_\rho[J]$ denote the submatrix obtained from $\Gamma_\rho$ by considering only the rows with
indices in $J$. We say that a permutation $\rho$ is bad for $J$ if there is a $k$-element set of
column indices $I = \{i_1, i_2, \ldots, i_k\}$ with $i_1 < i_2 < \cdots < i_k$ such that for every
$s \in \{1, b+1, \ldots, (r-1)b+1\}$, the matrix $\Gamma_\rho[J]$ contains a row compatible with
$\{i_s, i_{s+1}, \ldots, i_k\}$. Finally, we define $p_J$ as the probability that a random
permutation $\rho$ is bad for $J$.

Lemma 3.4. We have $p_J \le \bigl(1 - (b/n)^{1/(r-1)}\bigr)^{b(r-1)}$.

Proof. Let $\rho$ be a bad permutation for $J$, and let $I = \{i_1, i_2, \ldots, i_k\}$ be the
corresponding set of column indices.

Let $\ell \in \{0, 1, \ldots, r-1\}$. There are $r - \ell$ rows of $\Gamma_\rho[J]$ that are
compatible with a subset of $\{i_{\ell b+1}, i_{\ell b+2}, \ldots, i_k\}$. It follows that for
$i < i_{\ell b+1}$, the $i$th column of $\Gamma_\rho[J]$ contains at most $\ell$ entries 1.
Moreover, for $i \in \{i_{\ell b+1}, i_{\ell b+2}, \ldots, i_{(\ell+1)b}\}$, the $i$th column of
$\Gamma_\rho[J]$ contains exactly $\ell + 1$ entries 1, since every row compatible with
$\{i_s, i_{s+1}, \ldots, i_k\}$ for $s \le \ell b + 1$ has an entry 1 in these columns.

We now partition $[n]$ into $[n] = Q_0 \cup Q_1 \cup \cdots \cup Q_r$, where $Q_\ell$ consists of
the indices of those columns of $\Gamma[J]$ that contain exactly $\ell$ entries 1 (and $r - \ell$
entries 0). For $\ell \in [r]$ and $p \in [b]$, let $q_p^{(\ell)}$ denote the $p$th smallest element
of $\rho(Q_\ell)$. A necessary condition on $\rho$ is

$$q_b^{(1)} < q_1^{(2)} < q_b^{(2)} < q_1^{(3)} < \cdots < q_b^{(r-1)} < q_1^{(r)}.$$

For $\ell \in [r-1]$, let $E_\ell$ denote the event
$E_\ell := \{q_b^{(\ell)} < \min(q_1^{(\ell+1)}, q_1^{(\ell+2)}, \ldots, q_1^{(r)})\}$; we bound
$p_J$ by the product of conditional probabilities

$$p_J \le \mathrm{Prob}(E_1) \cdot \mathrm{Prob}(E_2 \mid E_1) \cdot \mathrm{Prob}(E_3 \mid E_1 \cap E_2) \cdots \mathrm{Prob}(E_{r-1} \mid E_1 \cap \cdots \cap E_{r-2}). \tag{6}$$

For $\ell \in [r-1]$, $\mathrm{Prob}(E_\ell \mid E_1 \cap \cdots \cap E_{\ell-1})$ is the probability
that the $b$ smallest elements of $\rho(Q_\ell) \cup \rho(Q_{\ell+1}) \cup \cdots \cup \rho(Q_r)$
belong to $\rho(Q_\ell)$. This probability is equal to

$$\binom{|Q_\ell|}{b} \Big/ \binom{|Q_\ell| + |Q_{\ell+1}| + \cdots + |Q_r|}{b} \le \Bigl(\frac{|Q_\ell|}{|Q_\ell| + |Q_{\ell+1}| + \cdots + |Q_r|}\Bigr)^{b}.$$

So, letting $x_\ell = |Q_\ell|$, Inequality (6) implies

$$p_J \le \Bigl(\frac{x_1}{x_1 + x_2 + \cdots + x_r} \cdot \frac{x_2}{x_2 + x_3 + \cdots + x_r} \cdots \frac{x_{r-1}}{x_{r-1} + x_r}\Bigr)^{b} \le \Bigl(1 - (x_r/n)^{1/(r-1)}\Bigr)^{b(r-1)},$$

the last inequality being Lemma 3.3. Then the lemma follows using $|Q_r| \ge b$. □



Proof of Theorem 1.1. Let $n$ and $m \ge 3$ be integers. Let
$\mathcal{F} = \{F_1, F_2, \ldots, F_n\}$ be a family of $n$ sets whose Venn diagram $\mathcal{V}$
has size $m$. Let $p(k)$ denote the probability that $\mathcal{K}_\rho$ contains at least one
simplex of size $k$, where $\rho$ is chosen uniformly at random among all permutations of $[n]$.
From Lemmas 3.2 and 3.4, by the union bound over all $r$-element sets $J \subseteq [m]$, for every
$r \ge 2$ and $b \ge 2$ we have

$$p(rb) \le \sum_{J} p_J \le \binom{m}{r} \Bigl(1 - (b/n)^{1/(r-1)}\Bigr)^{b(r-1)} < m^{r} e^{-b(r-1)(b/n)^{1/(r-1)}}.$$

Assuming that $b \ge 2e \ln m$, we get $p(rb) < m^{r - 2e(r-1)(b/n)^{1/(r-1)}}$, and choosing
$r \ge 1 + \ln \frac{n}{b}$, we obtain

$$(b/n)^{1/(r-1)} = e^{-\frac{1}{r-1} \ln \frac{n}{b}} \ge e^{-1} \quad\text{and}\quad p(rb) < m^{2-r} \le 1.$$

Thus, with $D = \lceil 2e \ln m \rceil \lceil 1 + \ln \frac{n}{\ln m} \rceil$ as in the theorem, we
have $p(D) < 1$, and so there exists a permutation $\rho^*$ of $[n]$ such that
$\mathcal{K}_{\rho^*}$ contains no simplex of size $D$ (or larger). By Lemma 3.1,
$(\mathcal{F}, \mathcal{K}_{\rho^*})$ is an abstract tube and $\mathcal{K}_{\rho^*}$ has at most
$\sum_{i=1}^{D} \binom{n}{i}$ simplices. The IE-vector obtained from the abstract tube
$(\mathcal{F}, \mathcal{K}_{\rho^*})$ as in Equation (5) is as claimed in the theorem.

In order to actually compute a suitable coefficient vector, we choose a random permutation $\rho$
and compute $\mathcal{K}_\rho$ by the following incremental algorithm. We use two auxiliary set
systems $\mathcal{A}$ and $\mathcal{B}$, initialized to
$\mathcal{A} = \mathcal{B} = \{\emptyset\}$ (the idea is that $\mathcal{B}$ contains all the
simplices of $\mathcal{K}_\rho$ found so far, and $\mathcal{A} \subseteq \mathcal{B}$ contains those
for which we still need to test one-element extensions). In each step, we take some
$\sigma \in \mathcal{A}$, remove it from $\mathcal{A}$, and for each $i \notin \sigma$, we test
whether $\sigma \cup \{i\}$ belongs to $\mathcal{K}_\rho$ (for this, we just check if there is
$\tau \in \mathcal{V}$ such that $w_\rho(\tau) \in \sigma \cup \{i\} \subseteq \tau$). Those
$\sigma \cup \{i\}$ that pass this test are added to both $\mathcal{A}$ and $\mathcal{B}$. The
algorithm finishes either when $\mathcal{A} = \emptyset$ (in this case we set
$\mathcal{K}_\rho = \mathcal{B} \setminus \{\emptyset\}$ and return the corresponding IE-vector), or
when we first discover a simplex $\sigma \in \mathcal{K}_\rho$ of size larger than $D$. In the
latter case, we discard the current permutation $\rho$, choose a new one, and repeat the algorithm.

The choice of a random permutation $\rho$ takes $O(n \ln n)$ time and $n$ random bits. Accepting
or rejecting a new simplex by brute-force testing takes $O(mn)$ time. The expected number of times
we have to start over with a new permutation $\rho$ is $O(1)$. Altogether, the expected running
time of this algorithm is $O\bigl(\binom{n}{D} mn\bigr) = m^{O(\ln^2 n)}$. □
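The incremental procedure can be sketched in Python as follows. This is a hedged variant rather than a transcription of the description above: it explores candidates in breadth-first order and, as a safeguard that we add ourselves, accepts a candidate only if all of its facets were accepted before, so that membership in $\mathcal{K}_\rho$ is verified for every nonempty subset. The function names, the toy input, and the convention that the selector picks the element appearing first in a shuffled list are assumptions made for illustration.

```python
import random
from collections import deque
from math import ceil, e, log


def venn_diagram(family):
    """Index sets tau of the nonempty regions of the family."""
    ground = set().union(*family)
    return {frozenset(i for i, F in enumerate(family, 1) if p in F) for p in ground}


def random_tube(venn, n, rng):
    """Retry random orderings until the complex has no simplex larger than D."""
    m = len(venn)  # the proof assumes m >= 3, so log(m) > 1 here
    D = ceil(2 * e * log(m)) * ceil(1 + log(n / log(m)))
    while True:
        order = list(range(1, n + 1))
        rng.shuffle(order)
        rank = {v: pos for pos, v in enumerate(order)}
        # Selector: the element of tau that comes first in the random ordering.
        w = {tau: min(tau, key=rank.get) for tau in venn}

        def passes(theta):
            return any(w[tau] in theta and theta <= tau for tau in venn)

        found = {frozenset()}
        queue = deque([frozenset()])
        too_large = False
        while queue and not too_large:
            sigma = queue.popleft()
            for i in range(1, n + 1):
                cand = sigma | {i}
                if i in sigma or cand in found:
                    continue
                # Safeguard (our assumption): all facets must already be accepted.
                if all(cand - {j} in found for j in cand) and passes(cand):
                    if len(cand) > D:
                        too_large = True
                        break
                    found.add(cand)
                    queue.append(cand)
        if not too_large:
            return found - {frozenset()}


def tube_sum(family, simplices):
    """Right-hand side of the abstract-tube formula for the counting measure."""
    total = 0
    for sigma in simplices:
        common = set.intersection(*(family[i - 1] for i in sigma))
        total += (-1) ** (len(sigma) + 1) * len(common)
    return total


family = [{1, 2, 4}, {2, 3, 4, 5}, {4, 5, 6}]  # hypothetical toy input
K = random_tube(venn_diagram(family), len(family), random.Random(0))
print(tube_sum(family, K), len(set().union(*family)))
```

On an input this small the size threshold $D$ exceeds $n$, so the first ordering is always accepted; the retry loop only matters for larger instances. Whatever ordering is drawn, the resulting sum must equal the size of the union by Lemma 3.1.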
4 The lower bound: proof of Theorem 1.2

For any $m$ between $n$ and $2^n$ there exists a system of $n$ sets with Venn diagram of size $m$
and whose only IE-vector has $m$ nonzero entries. Indeed, let
$K = \{\vartheta_1, \vartheta_2, \ldots, \vartheta_m\}$ be a simplicial complex over $[n]$ such
that $[n] = \bigcup K$ and $|K| = m$. We define $F_i = \{t \in [m] : i \in \vartheta_t\}$ for
$1 \le i \le n$ and put $\mathcal{F} = \{F_1, F_2, \ldots, F_n\}$. It can easily be checked that
$\mathcal{V}(\mathcal{F}) = \mathcal{N}(\mathcal{F}) = K$ so, as observed in Example 2.3, the
matrix $A$ is square, lower-triangular, and has 1's on the diagonal; there is therefore a unique
IE-vector for $\mathcal{F}$ and it has $m$ nonzero entries. In this section we improve this lower
bound.

We recall that by Corollary 2.4, every set system has a unique IE-vector with support in the Venn
diagram $\mathcal{V}(\mathcal{F})$. This leads to the following observation.

Lemma 4.1. Let $\mathcal{F}$ be a finite family of sets with Venn diagram $\mathcal{V}$. If
$\mathcal{V} \cup \{[n]\}$, considered as a poset with respect to the inclusion relation, is a
join-semilattice,^5 then among all IE-vectors for $\mathcal{F}$, the one with support in
$\mathcal{V}$ has minimal $\ell_1$-norm.

Proof. Let $A$ be the matrix with rows indexed by $\mathcal{V}$ and columns indexed by
$\mathcal{N} = \mathcal{N}(\mathcal{F})$, as defined before Lemma 2.1, and let $B$ be the
$m \times m$ submatrix consisting of the first $m$ columns of $A$.

We want to show that every column of $A$ is also a column of $B$. By the definition of $A$, this
means that for every $\sigma \in \mathcal{N}$ we need to find some $\upsilon \in \mathcal{V}$ such
that $\{\tau \in \mathcal{V} : \sigma \subseteq \tau\} = \{\tau \in \mathcal{V} : \upsilon \subseteq \tau\}$.
It is easily seen that the join of all $\tau \in \mathcal{V}$ with $\tau \subseteq \sigma$ is such
a $\upsilon$ (we note that $\upsilon \in \mathcal{V}$, since all inclusion-maximal elements of
$\mathcal{N}$ are in $\mathcal{V}$).

Hence every column of $A$ occurs in $B$ as asserted. It follows that every solution of
$Ax = \mathbf{1}$ can be transformed to a solution of $By = \mathbf{1}$ with the same or smaller
$\ell_1$-norm (if $k$ is the index of a column outside $B$ with $x_k \neq 0$, and that $k$th column
equals the $j$th column of $B$, then we can zero out $x_k$ while replacing $x_j$ with
$x_j + x_k$). Since $By = \mathbf{1}$ has a unique solution, it has to be a solution of minimum
$\ell_1$-norm as claimed. □



We can now show that for arbitrarily large $n$ there exist families of $n$ sets with Venn diagrams
of size $n$ for which any IE-vector has $\ell_1$-norm at least $(n/2)^{3/2}$.



Proof of Theorem 1.2. Let $s = q^2 + q + 1$, where $q$ is a power of a prime number. Let $(P, \mathcal{L})$ be a
finite projective plane^6 of order $q$ and let us put $S = P \cup \mathcal{L}$.

We number the elements of $P$ arbitrarily as $P = \{p_1, p_2, \ldots, p_s\}$, and similarly $\mathcal{L} = \{\ell_1, \ell_2, \ldots, \ell_s\}$.
For $i \in [s]$ we set $F_i = \{\ell_i\} \cup \{p_j : p_j \in \ell_i\}$, that is, the line $\ell_i$ together with all the points it contains,
and $F_{i+s} = \{p_i\}$. Our set system is $\mathcal{F} = \{F_1, \ldots, F_n\}$, $n = 2s$.

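To describe the Venn diagram $\mathcal{V}(\mathcal{F})$, we note that each line $\ell_i \in \mathcal{L}$ is contained only in $F_i$,
while each point $p_i \in P$ is contained in $F_{i+s}$ and in every $F_{i'}$ with $p_i \in \ell_{i'}$. Therefore, $\mathcal{V}$ consists of
$\tau_i = \{i\}$ and $\tau_{i+s} = \{i+s\} \cup \{i' : p_i \in \ell_{i'}\}$, $i \in [s]$. In particular, $m = |\mathcal{V}| = 2s = n$. It is easy to find
the unique IE-vector $\alpha$ for $\mathcal{F}$ with support in $\mathcal{V}$: the nonzero components are $\alpha_k = 1$ for $k = 1, 2, \ldots, s$
and $\alpha_k = -q$ for $k = s+1, s+2, \ldots, 2s$. Thus $\|\alpha\|_1 = s(q+1) > (q^2+q+1)^{3/2} = (n/2)^{3/2}$,
where the inequality holds because $(q+1)^2 > q^2+q+1 = s$.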



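^5 This means that for every $\tau_1, \ldots, \tau_k \in \mathcal{V}$ there is the join $\tau = \bigvee_{i=1}^{k} \tau_i \in \mathcal{V} \cup \{[n]\}$, meaning that all $\tau_i \subseteq \tau$, and
also $\tau \subseteq \tau'$ whenever $\tau' \in \mathcal{V} \cup \{[n]\}$ contains all of $\tau_1, \ldots, \tau_k$.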

^6 A finite projective plane of order $q$ is a pair of sets $(P, \mathcal{L})$ where $P$ is a set of $q^2 + q + 1$ points and $\mathcal{L} \subseteq 2^P$ is a
set of $q^2 + q + 1$ lines such that every line contains $q + 1$ points, every point is in $q + 1$ lines, every two lines intersect
in a single point and every two points are contained in exactly one line. It is well known that a projective plane of
order $q$ exists whenever $q$ is a power of a prime number.

It remains to check that $\mathcal{V} \cup \{[n]\}$ is a join-semilattice; then Theorem 1.2 will follow from
Lemma 4.1. Since $\mathcal{V}$ is finite, it is enough to verify that the join $\tau_i \vee \tau_j$ exists for every two $i, j \in [n]$,
$i < j$. If $i, j \le s$, then $\tau_i \cup \tau_j = \{i, j\}$, and this is contained only in $[n]$ and in $\tau_{k+s}$, where $p_k$ is the
point of intersection of $\ell_i$ and $\ell_j$. Therefore, $\tau_i \vee \tau_j = \tau_{k+s}$. If $i \le s$ and $j > s$, then either $\tau_i \subseteq \tau_j$
(which implies $\tau_i \vee \tau_j = \tau_j$) or $\tau_i \cup \tau_j$ has at least $q + 3$ elements (which implies $\tau_i \vee \tau_j = [n]$ since
$\mathcal{V}$ contains only sets of size at most $q + 2$). Finally, if $i, j > s$, then $\tau_i \cup \tau_j$ again has at least $q + 3$
elements, implying $\tau_i \vee \tau_j = [n]$.

This concludes the proof. □
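
The construction used in this proof can be verified numerically for small orders. The Python sketch below is a hedged illustration, not part of the argument: it assumes $q$ is prime (so that arithmetic modulo $q$ gives a field), uses 0-based indices, and all function names are hypothetical. It builds the family from the projective plane over the integers modulo $q$, computes the Venn diagram, tests the candidate IE-vector from the proof, and checks the join-semilattice property by brute force.

```python
from itertools import product

def projective_plane(q):
    """Lines of PG(2, q) for a PRIME q, each as a frozenset of point indices."""
    # Points: nonzero triples mod q, normalised so the first nonzero entry is 1.
    pts = [v for v in product(range(q), repeat=3)
           if any(v) and next(x for x in v if x) == 1]
    # Lines use the same normalised triples; p lies on l iff <l, p> = 0 mod q.
    return [frozenset(j for j, p in enumerate(pts)
                      if sum(a * b for a, b in zip(l, p)) % q == 0)
            for l in pts]

def lower_bound_family(q):
    """F_i = line i with its points (i < s), F_{i+s} = {point i}; 0-based."""
    lines = projective_plane(q)
    s = len(lines)
    fam = [frozenset({("line", i)} | {("pt", j) for j in lines[i]})
           for i in range(s)]
    return fam + [frozenset({("pt", i)}) for i in range(s)]

def venn_diagram(family):
    """Distinct index sets {i : x in F_i}, one for each ground element x."""
    ground = set().union(*family)
    return {frozenset(i for i, F in enumerate(family) if x in F)
            for x in ground}

def check_ie_vector(q):
    """Return (n, m, is_ie_vector, l1_norm) for the vector from the proof."""
    family = lower_bound_family(q)
    V = venn_diagram(family)
    # alpha = 1 on the singletons tau_i, and -q on the larger sets tau_{i+s}.
    alpha = {tau: (1 if len(tau) == 1 else -q) for tau in V}
    # x lies in the intersection indexed by sigma iff sigma <= tau(x).
    valid = all(sum(a for sigma, a in alpha.items() if sigma <= tau) == 1
                for tau in V)
    return len(family), len(V), valid, sum(abs(a) for a in alpha.values())

def is_join_semilattice(V, n):
    """Check that every pair in V + {[n]} has a least upper bound."""
    poset = set(V) | {frozenset(range(n))}
    for a in poset:
        for b in poset:
            uppers = [t for t in poset if a <= t and b <= t]
            if not any(all(t <= u for u in uppers) for t in uppers):
                return False
    return True

for q in (2, 3):
    n, m, valid, norm = check_ie_vector(q)
    V = venn_diagram(lower_bound_family(q))
    print(q, n, m, valid, norm, norm > (n / 2) ** 1.5, is_join_semilattice(V, n))
```

For $q = 2$ (the Fano plane) this gives $n = m = 14$ and $\ell_1$-norm $s(q+1) = 21$, which exceeds $(n/2)^{3/2} = 7^{3/2} \approx 18.5$.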



5 Open problems 

For several NP-hard problems, the best exponential-time algorithms rely on the inclusion-exclusion
principle [BHK09, vRNvD09, NvR10]. Whether these algorithms can be improved using Theorem 1.1
is an open problem that is perhaps best illustrated by an example.

Consider for instance the question of counting the number of $k$-covers: given a family $\mathcal{X} =
\{X_1, X_2, \ldots, X_p\}$ of subsets of $[n]$, we want to determine how many $k$-element subsets of $\mathcal{X}$ have
their union equal to $[n]$. Björklund et al. [BHK09, Section 3.1] proposed the following approach.
For $i \in [n]$, let $F_i$ denote the set of $k$-element subsets of $\mathcal{X}$ whose union does not contain $i$: $F_i =
\{(Y_1, Y_2, \ldots, Y_k) \in \mathcal{X}^k : i \notin Y_1 \cup \cdots \cup Y_k\}$. For a subset $\sigma \subseteq [n]$ let $\mathrm{av}(\sigma) = \{X \in \mathcal{X} : X \cap \sigma = \emptyset\}$.

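The number of $k$-covers of $\mathcal{X}$ can be written, using the inclusion-exclusion principle, as

$$|\mathcal{X}|^k - \Bigl|\bigcup_{i=1}^{n} F_i\Bigr| = |\mathcal{X}|^k - \sum_{\emptyset \ne \sigma \subseteq [n]} (-1)^{|\sigma|+1} \Bigl|\bigcap_{i \in \sigma} F_i\Bigr| = |\mathcal{X}|^k - \sum_{\emptyset \ne \sigma \subseteq [n]} (-1)^{|\sigma|+1} |\mathrm{av}(\sigma)|^k. \tag{7}$$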



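Let $f$ denote the indicator function of $\mathcal{X}$ and $\hat f$ its Möbius transform: for $I \subseteq [n]$, $\hat f(I) = \sum_{S \subseteq I} f(S)$
($\hat f$ is sometimes also called the Zeta transform). Since $|\mathrm{av}(\sigma)| = \sum_{S \subseteq [n] \setminus \sigma} f(S) = \hat f([n] \setminus \sigma)$, formula (7)
can be deduced from the Möbius transform of $f$ by summing its $k$th powers.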
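
As an illustration of this approach, the following Python sketch evaluates formula (7) over the full subset lattice. It is a minimal sketch under stated assumptions, not the implementation of [BHK09]: the ground set is taken to be $\{0, \ldots, n-1\}$, subsets are encoded as bitmasks, the sets of the family are assumed distinct, and the function names are hypothetical. As in formula (7), it counts ordered $k$-tuples $(Y_1, \ldots, Y_k)$, and a brute-force count is included for comparison.

```python
from itertools import product

def zeta_transform(f, n):
    """g[I] = sum of f[S] over all S contained in I (subsets as bitmasks)."""
    g = list(f)
    for i in range(n):                  # Yates-style pass, one element at a time
        for mask in range(1 << n):
            if mask >> i & 1:
                g[mask] += g[mask ^ (1 << i)]
    return g

def count_k_covers(sets, n, k):
    """Number of ordered k-tuples from `sets` whose union is {0..n-1}."""
    f = [0] * (1 << n)
    for X in sets:
        f[sum(1 << i for i in X)] = 1   # indicator function of the family
    g = zeta_transform(f, n)
    full = (1 << n) - 1
    # |av(sigma)| = g[full ^ sigma]; the term sigma = 0 contributes |X|^k.
    return sum((-1) ** bin(sigma).count("1") * g[full ^ sigma] ** k
               for sigma in range(1 << n))

def count_k_covers_brute(sets, n, k):
    """Reference count by enumerating all k-tuples."""
    target = set(range(n))
    return sum(set().union(*tup) == target for tup in product(sets, repeat=k))

sets = [{0}, {1}, {0, 1}, {2}, {1, 2}]
print(count_k_covers(sets, 3, 2), count_k_covers_brute(sets, 3, 2))
```

The transform uses $O(n 2^n)$ additions on a table of size $2^n$, which is why restricting the sum to a smaller hereditary support would reduce the cost.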

If $\mathcal{K}$ is a simplicial complex with $n$ vertices and $|\mathcal{K}|$ simplices, then the values of $\hat f(\sigma)$ for all
$\sigma \in \mathcal{K}$ can be computed in $O(n|\mathcal{K}|)$ time by Yates' algorithm [Knu97, Section 4.3.4]. The above
method for counting $k$-covers therefore runs in time $O(n 2^n)$. Simplifying the inclusion-exclusion
formula (7) while keeping its support hereditary, as Theorem 1.1 does, improves the running time to
$O(ns)$, where $s$ is the size of the formula ($s = m^{O(\log^2 n)}$ by Theorem 1.1). When the Venn diagram
of the $F_i$'s has size $m = 2^{o(n/\log^2 n)}$, this complexity becomes subexponential in $n$. However, the catch
is that, in the above example and many other problems [BHK09, vRNvD09, NvR10], the family
$\mathcal{F} = \{F_1, F_2, \ldots, F_n\}$ is not standardized, which is a crucial assumption for the computational
statement in Theorem 1.1. Whether a simplified formula can be computed efficiently in this context
is an open problem.